Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task
Abstract
This paper presents speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. We experimentally evaluate techniques for combining the outputs of multiple LVCSR models, using a language model (LM) with a 60,000-word vocabulary, in the recognition of spoken queries. As the model combination technique, we use SVM learning. We show that combining multiple LVCSR models improves both speech recognition accuracy and retrieval accuracy in speech-driven text retrieval. Comparing retrieval accuracy when LMs with 20,000-word and 60,000-word vocabularies are used in the LVCSRs, we find that the LM with the larger vocabulary also improves retrieval accuracy.
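The idea of learning which recognizer outputs to trust can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's setup: the three recognizers, the binary "which recognizers hypothesized this word" features, the toy labels, and the function names are all hypothetical. A plain hinge-loss linear classifier (the unregularized core of a linear SVM) stands in for the SVM learner named in the abstract.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy data: each row records which of three recognizers (A, B, C)
# hypothesized a word; the label is +1 if the word was correct, -1 otherwise.
# Here a word counts as correct when at least two recognizers agree on it.
TRAIN_FEATURES = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
]
TRAIN_LABELS = [-1, -1, -1, 1, 1, 1, 1]


def decision_value(weights: Sequence[float], bias: float,
                   features: Sequence[float]) -> float:
    """Signed score of a word hypothesis under the linear model."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_hinge_classifier(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    max_epochs: int = 1000,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float]:
    """Fit a linear classifier by subgradient steps on the hinge loss.

    This is a stand-in for SVM training: it uses the same hinge loss,
    but omits the regularization term of a full SVM.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        updated = False
        for features, label in zip(samples, labels):
            # Update only when the sample falls inside the unit margin.
            if label * decision_value(weights, bias, features) < 1:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
                updated = True
        if not updated:
            break  # every training sample now has margin >= 1
    return weights, bias


def accept_word(weights: Sequence[float], bias: float,
                features: Sequence[float]) -> bool:
    """Keep a hypothesized word if the classifier scores it positive."""
    return decision_value(weights, bias, features) > 0


WEIGHTS, BIAS = train_hinge_classifier(TRAIN_FEATURES, TRAIN_LABELS)
```

With this toy training set, `accept_word(WEIGHTS, BIAS, (1, 1, 0))` keeps a word that two recognizers agree on, while `accept_word(WEIGHTS, BIAS, (1, 0, 0))` rejects a word proposed by only one. A real system would use richer features and a full SVM implementation.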
Similar resources
Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task
This paper presents speech-driven Web retrieval models which accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluated the techniques of combining outputs of multiple LVCSR models in recognition...
Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task
This paper studies speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluate the techniques of combining outputs of multiple LVCSR models in recognitio...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems
This paper describes an accurate unsupervised speaker adaptation method for lecture speech recognition using multiple LVCSRs. In an unsupervised speaker adaptation framework, the improvement of recognition performance by adapting acoustic models greatly depends on the accuracy of labels such as phonemes and syllables. Therefore, extraction of the adaptation data guided by the confidence measure...
Evaluating Speech-Driven IR in the NTCIR-3 Web Retrieval Task
Speech recognition has of late become a practical technology for real world applications. For the purpose of research and development in speech-driven retrieval, which facilitates retrieving information with spoken queries, we organized the speech-driven retrieval subtask in the NTCIR-3 Web retrieval task. Search topics for the Web retrieval main task were dictated by ten speakers and recorded ...